3,365 research outputs found

    Enumeration of RNA structures by Matrix Models

    We enumerate the number of RNA contact structures according to their genus, i.e. the topological character of their pseudoknots. By using a recently proposed matrix model formulation for the RNA folding problem, we obtain exact results for the simple case of an RNA molecule with an infinitely flexible backbone, in which any arbitrary pair of bases is allowed. We analyze the distribution of the genus of pseudoknots as a function of the total number of nucleotides along the phosphate-sugar backbone. Comment: RevTeX, 4 pages, 2 figures
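As a rough illustration of what "enumerating by genus" means, the sketch below brute-forces small chains instead of using the matrix model of the abstract. It assumes that every pair of sites may bond, that the backbone is closed into a circle for loop counting, and that the genus of a diagram with P arcs and L boundary loops is g = (P + 1 - L)/2. The function names are hypothetical and not taken from the paper.

```python
def partial_matchings(n):
    """Yield every pairing of sites 0..n-1 as a list `partner`,
    where partner[i] == i means site i is unpaired."""
    partner = list(range(n))

    def extend(i):
        # Skip sites that were already paired with an earlier site.
        while i < n and partner[i] != i:
            i += 1
        if i == n:
            yield list(partner)
            return
        # Option 1: leave site i unpaired.
        yield from extend(i + 1)
        # Option 2: pair site i with any later free site j.
        for j in range(i + 1, n):
            if partner[j] == j:
                partner[i], partner[j] = j, i
                yield from extend(i + 1)
                partner[i], partner[j] = i, j

    yield from extend(0)


def genus(partner):
    """Genus of one diagram, from the cycles of i -> partner[i] + 1 (mod n)."""
    n = len(partner)
    arcs = sum(1 for i, j in enumerate(partner) if i < j)
    seen = [False] * n
    loops = 0
    for start in range(n):
        if seen[start]:
            continue
        loops += 1
        i = start
        while not seen[i]:
            seen[i] = True
            i = (partner[i] + 1) % n  # cross the arc, then step along the backbone
    return (arcs + 1 - loops) // 2


def genus_distribution(n):
    """Map genus -> number of contact structures on n sites."""
    counts = {}
    for partner in partial_matchings(n):
        g = genus(partner)
        counts[g] = counts.get(g, 0) + 1
    return counts


print(genus_distribution(4))
```

The enumeration grows very quickly with the chain length, so this only works for short chains; the exact results in the abstract are what make larger lengths accessible.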

    Similarity-Detection and Localization

    The detection of similarities between long DNA and protein sequences is studied using concepts of statistical physics. It is shown that mutual similarities can be detected by sequence alignment methods only if their amount exceeds a threshold value. The onset of detection is a continuous phase transition which can be viewed as a localization-delocalization transition. The "fidelity" of the alignment is the order parameter of that transition; it leads to criteria for the selection of optimal alignment parameters. Comment: 4 pages including 4 figures (308kb post-script file)

    Model for Folding and Aggregation in RNA Secondary Structures

    We study the statistical mechanics of RNA secondary structures designed to have an attraction between two different types of structures as a model system for heteropolymer aggregation. The competition between the branching entropy of the secondary structure and the energy gained by pairing drives the RNA to undergo a "temperature independent" second-order phase transition from a molten to an aggregated phase. The aggregated phase thus obtained has a macroscopically large number of contacts between different RNAs. The partition function scaling exponent for this phase is $\theta \approx 1/2$ and the crossover exponent of the phase transition is $\nu \approx 5/3$. The relevance of these calculations to the aggregation of biological molecules is discussed. Comment: Revtex, 4 pages; 3 Figures; Final published version

    Quantification of the differences between quenched and annealed averaging for RNA secondary structures

    The analytical study of disordered systems is usually difficult due to the necessity to perform a quenched average over the disorder. Thus, one may resort to the easier annealed ensemble as an approximation to the quenched system. In the study of RNA secondary structures, we explicitly quantify the deviation of this approximation from the quenched ensemble by looking at the correlations between neighboring bases. This quantified deviation then allows us to propose a constrained annealed ensemble which predicts physical quantities much closer to the results of the quenched ensemble without becoming technically intractable. Comment: 9 pages, 14 figures, submitted to Phys. Rev.

    A New Simulated Annealing Algorithm for the Multiple Sequence Alignment Problem: The approach of Polymers in a Random Media

    We propose a probabilistic algorithm to solve the Multiple Sequence Alignment problem. The algorithm is a Simulated Annealing (SA) scheme that exploits the representation of the Multiple Alignment between $D$ sequences as a directed polymer in $D$ dimensions. Within this representation we can easily track the evolution in the configuration space of the alignment through local moves of low computational cost. At variance with other probabilistic algorithms proposed to solve this problem, our approach allows for the creation and deletion of gaps without extra computational cost. The algorithm was tested by aligning proteins from the kinase family. When $D = 3$ the results are consistent with those obtained using a complete algorithm. For $D > 3$, where the complete algorithm fails, we show that our algorithm still converges to reasonable alignments. Moreover, we study the space of solutions obtained and show that, depending on the number of sequences aligned, the solutions are organized in different ways, suggesting a possible source of errors for progressive algorithms. Comment: 7 pages and 11 figures

    Nature of the glassy phase of RNA secondary structure

    We characterize the low temperature phase of a simple model for RNA secondary structures by determining the typical energy scale $E(l)$ of excitations involving $l$ bases. At zero temperature, we find a scaling law $E(l) \sim l^\theta$ with $\theta \approx 0.23$, and this same scaling holds at low enough temperatures. Above a critical temperature, there is a different phase characterized by a relatively flat free energy landscape resembling that of a homopolymer with a scaling exponent $\theta = 1$. These results strengthen the evidence in favour of the existence of a glass phase at low temperatures. Comment: 7 pages, 1 figure

    Exact solution of the Bernoulli matching model of sequence alignment

    Through a series of exact mappings we reinterpret the Bernoulli model of sequence alignment in terms of the discrete-time totally asymmetric exclusion process with backward sequential update and step function initial condition. Using earlier results from the Bethe ansatz we obtain analytically the exact distribution of the length of the longest common subsequence of two sequences of finite lengths $X, Y$. Asymptotic analysis adapted from random matrix theory allows us to derive the thermodynamic limit directly from the finite-size result. Comment: 13 pages, 4 figures
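For readers unfamiliar with the Bernoulli matching model, the following is a minimal numerical sketch, not the exact solution described in the abstract. It assumes the usual formulation in which each cell (i, j) of the alignment grid is an independent match with probability p, and the longest-match length obeys the standard dynamic-programming recursion L(i, j) = max(L(i-1, j), L(i, j-1), L(i-1, j-1) + eta(i, j)). The function name and parameters are hypothetical.

```python
import random


def bernoulli_matching_length(x, y, p, rng=None):
    """Sample one realization of the longest-match length for an x-by-y grid
    whose cells are independent matches with probability p."""
    rng = rng or random.Random()
    prev = [0] * (y + 1)  # row i-1 of the DP table
    for _ in range(x):
        cur = [0] * (y + 1)
        for j in range(1, y + 1):
            eta = 1 if rng.random() < p else 0  # independent match variable
            cur[j] = max(prev[j], cur[j - 1], prev[j - 1] + eta)
        prev = cur
    return prev[y]


# Estimate the mean length for a small grid by plain Monte Carlo sampling.
rng = random.Random(0)
samples = [bernoulli_matching_length(50, 50, 0.25, rng) for _ in range(200)]
print(sum(samples) / len(samples))
```

Sampling like this only approximates the distribution; the point of the abstract is that the finite-size distribution can be obtained exactly.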

    An O(n^3)-Time Algorithm for Tree Edit Distance

    The edit distance between two ordered trees with vertex labels is the minimum cost of transforming one tree into the other by a sequence of elementary operations consisting of deleting and relabeling existing nodes, as well as inserting new nodes. In this paper, we present a worst-case $O(n^3)$-time algorithm for this problem, improving the previous best $O(n^3 \log n)$-time algorithm [Klein]. Our result requires a novel adaptive strategy for deciding how a dynamic program divides into subproblems (which is interesting in its own right), together with a deeper understanding of the previous algorithms for the problem. We also prove the optimality of our algorithm among the family of decomposition strategy algorithms (which also includes the previous fastest algorithms) by tightening the known lower bound of $\Omega(n^2 \log^2 n)$ [Touzet] to $\Omega(n^3)$, matching our algorithm's running time. Furthermore, we obtain matching upper and lower bounds of $\Theta(n m^2 (1 + \log \frac{n}{m}))$ when the two trees have different sizes $m$ and $n$, where $m < n$. Comment: 10 pages, 5 figures, 5 .tex files where TED.tex is the main one
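To make the definition of tree edit distance concrete, here is a small sketch of the textbook forest recursion with a fixed "always decompose at the rightmost root" strategy. It is not the adaptive $O(n^3)$ algorithm of the abstract. It assumes unit cost for every delete, insert, and relabel, and it represents a tree as a hypothetical `(label, children)` tuple built with the helper `node`.

```python
from functools import lru_cache


def node(label, *children):
    """Build an ordered labeled tree as a hashable (label, children) tuple."""
    return (label, tuple(children))


def size(forest):
    """Number of nodes in a forest (a tuple of trees)."""
    return sum(1 + size(children) for _, children in forest)


@lru_cache(maxsize=None)
def forest_distance(f, g):
    """Unit-cost edit distance between two ordered forests."""
    if not f and not g:
        return 0
    if not f:
        return size(g)  # insert every node of g
    if not g:
        return size(f)  # delete every node of f
    (label_v, kids_v), (label_w, kids_w) = f[-1], g[-1]
    # Delete the rightmost root of f; its children take its place.
    delete = forest_distance(f[:-1] + kids_v, g) + 1
    # Insert the rightmost root of g.
    insert = forest_distance(f, g[:-1] + kids_w) + 1
    # Map the two rightmost roots onto each other, relabeling if needed.
    match = (forest_distance(kids_v, kids_w)
             + forest_distance(f[:-1], g[:-1])
             + (label_v != label_w))
    return min(delete, insert, match)


def tree_edit_distance(t1, t2):
    return forest_distance((t1,), (t2,))


t1 = node("f", node("d", node("a"), node("c", node("b"))), node("e"))
t2 = node("f", node("c", node("d", node("a"), node("b"))), node("e"))
print(tree_edit_distance(t1, t2))
```

Memoizing on whole subforests keeps the sketch short but is far from the time and space bounds discussed in the abstract; the choice of which root to decompose at each step is exactly what the decomposition strategy algorithms optimize.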